Discrete and Continuous Difference of Submodular Minimization
Orfanides, George, Hoheisel, Tim, Halabi, Marwa El
Submodular functions, defined on continuous or discrete domains, arise in numerous applications. We study the minimization of the difference of two submodular (DS) functions over both domains, extending prior work restricted to set functions. We show that all functions on discrete domains, and all smooth functions on continuous domains, are DS. For discrete domains, we observe that DS minimization is equivalent to minimizing the difference of two convex (DC) functions, as in the set function case. We propose a novel variant of the DC Algorithm (DCA) and apply it to the resulting DC program, obtaining theoretical guarantees comparable to those in the set function case. The algorithm can be applied to continuous domains via discretization. Experiments demonstrate that our method outperforms baselines in integer compressive sensing and integer least squares.
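The DC structure is what makes this tractable: the classical DC Algorithm linearizes the subtracted convex part at the current iterate and minimizes the resulting convex surrogate. The following is a minimal sketch of the standard DCA iteration, not the paper's novel variant, on a toy one-dimensional DC function f(x) = x² − |x| (with g(x) = x² and h(x) = |x|):

```python
import numpy as np

def dca(grad_h, argmin_surrogate, x0, iters=50):
    """Generic DC Algorithm for min_x g(x) - h(x), with g and h convex.

    Each step linearizes h at x_k via a subgradient y_k, then minimizes
    the convex surrogate g(x) - <y_k, x>.
    """
    x = x0
    for _ in range(iters):
        y = grad_h(x)             # y_k in the subdifferential of h at x_k
        x = argmin_surrogate(y)   # x_{k+1} = argmin_x g(x) - <y_k, x>
    return x

# Toy DC function f(x) = x**2 - |x|, minimized at x = +-1/2 with f = -1/4.
# For g(x) = x**2, argmin_x x**2 - y*x = y/2 in closed form.
x_star = dca(grad_h=np.sign,
             argmin_surrogate=lambda y: y / 2.0,
             x0=0.3)
print(x_star)  # converges to the local solution 0.5 from a positive start
```

Starting from a negative point would converge to −1/2 instead, which illustrates that DCA, like its set-function counterpart, only guarantees convergence to a critical point.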
The underlying principle of this type of approach is to maintain a Bayesian posterior over dynamics (conditioned on past experienced transitions) and, at each time step, to seek the action optimizing the related augmented MDP (over state-history meta-states and the corresponding meta-dynamics), a problem that is generally intractable to solve exactly. This contribution builds on two prior ideas, simulation-based search (with root sampling, which avoids updating the belief over dynamics during planning) and value function approximation (which requires introducing suitable features for handling histories), combining them into a new approach. Beyond this general approach, called BAFA, the authors provide an alternative and more general proof of the validity of root sampling, as well as some experimental results. Overall, the paper is well written and clear, and it proposes a sound approach based on known ideas combined smartly (especially the history features, which appear to be the newest part and the core contribution). My main concern is that the experiments, while convincing, are not detailed enough to be reproducible (for example, some meta-parameters are neither provided nor discussed).
Inference for determinantal point processes without spectral knowledge
Determinantal point processes (DPPs) are point process models that naturally encode diversity between the points of a given realization, through a positive definite kernel K. DPPs possess desirable properties, such as exact sampling or analyticity of the moments, but learning the parameters of kernel K through likelihood-based inference is not straightforward. First, the kernel that appears in the likelihood is not K, but another kernel L related to K through an often intractable spectral decomposition. This issue is typically bypassed in machine learning by directly parametrizing the kernel L, at the price of some interpretability of the model parameters. We follow this approach here. Second, the likelihood has an intractable normalizing constant, which takes the form of a large determinant in the case of a DPP over a finite set of objects, and the form of a Fredholm determinant in the case of a DPP over a continuous domain. Our main contribution is to derive bounds on the likelihood of a DPP, both for finite and continuous domains. Unlike previous work, our bounds are cheap to evaluate since they do not rely on approximating the spectrum of a large matrix or an operator. Through usual arguments, these bounds thus yield cheap variational inference and moderately expensive exact Markov chain Monte Carlo inference methods for DPPs.
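To make the finite-set normalizing constant concrete: when the kernel L is parametrized directly (the approach followed above), the likelihood of a realization Y is det(L_Y) / det(L + I), and the det(L + I) term is exactly the large determinant that becomes expensive as the ground set grows. A small illustrative sketch:

```python
import numpy as np

def dpp_log_likelihood(L, Y):
    """Log-likelihood of subset Y under a finite L-ensemble DPP:
    P(Y) = det(L_Y) / det(L + I)."""
    L_Y = L[np.ix_(Y, Y)]                        # principal submatrix indexed by Y
    _, logdet_LY = np.linalg.slogdet(L_Y)
    _, logdet_norm = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_LY - logdet_norm               # normalizer: the "large determinant"

# Small positive-definite kernel on 3 items; likelihood of the subset {0, 2}.
L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
print(dpp_log_likelihood(L, [0, 2]))
```

For a ground set of size n, the normalizer requires an O(n³) determinant per likelihood evaluation, which is the cost the paper's bounds are designed to avoid.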
Bringing motion taxonomies to continuous domains via GPLVM on hyperbolic manifolds
Jaquier, Noémie, Rozo, Leonel, González-Duque, Miguel, Borovitskiy, Viacheslav, Asfour, Tamim
Human motion taxonomies serve as high-level hierarchical abstractions that classify how humans move and interact with their environment. They have proven useful for analysing grasps, manipulation skills, and whole-body support poses. Despite the substantial efforts devoted to designing their hierarchies and underlying categories, their use remains limited. This may be attributed to the lack of computational models that bridge the gap between the discrete hierarchical structure of the taxonomy and the high-dimensional heterogeneous data associated with its categories. To overcome this problem, we propose to model taxonomy data via hyperbolic embeddings that capture the associated hierarchical structure. We achieve this by formulating a novel Gaussian process hyperbolic latent variable model that incorporates the taxonomy structure through graph-based priors on the latent space and distance-preserving back constraints. We validate our model on three different human motion taxonomies, learning hyperbolic embeddings that faithfully preserve the original graph structure. We show that our model properly encodes unseen data from existing or new taxonomy categories, can be used to generate realistic trajectories between the embeddings, and outperforms its Euclidean and VAE-based counterparts.
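Hyperbolic embeddings suit taxonomies because geodesic distances grow rapidly with radius, mirroring the exponential growth of trees. As background only (the model choice here is ours for illustration, not a detail from the paper), here is the distance computation in the hyperboloid (Lorentz) model, one standard model of hyperbolic space:

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + x1*y1 + ..."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_dist(x, y):
    """Geodesic distance on the hyperboloid model: arccosh(-<x, y>_L).
    Clipping guards against values slightly below 1 from round-off."""
    return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

def lift(v):
    """Embed a Euclidean point v onto the hyperboloid {x : <x, x>_L = -1}."""
    return np.concatenate(([np.sqrt(1.0 + np.dot(v, v))], v))

a, b = lift(np.array([0.0, 0.0])), lift(np.array([3.0, 0.0]))
print(lorentz_dist(a, b))
```

Distances computed this way exceed their Euclidean counterparts away from the origin, giving the latent space the tree-like capacity that graph-based priors can exploit.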
Manipulating Predictions over Discrete Inputs in Machine Teaching
Wu, Xiaodong, Han, Yufei, Dahrouj, Hayssam, Ni, Jianbing, Liang, Zhenwen, Zhang, Xiangliang
Machine teaching often involves constructing an optimal (typically minimal) dataset to help a model (the `student') achieve specific goals set by a teacher. While such studies abound in the continuous domain, work on the effectiveness of machine teaching in the discrete domain is relatively limited. This paper focuses on machine teaching in the discrete domain, specifically on manipulating a student model's predictions toward the teacher's goals by efficiently changing the training data. We formulate this task as a combinatorial optimization problem and solve it with an iterative searching algorithm. Our algorithm demonstrates significant numerical merit in scenarios where a teacher corrects erroneous predictions to improve the student model, or maliciously manipulates the model to misclassify specific samples as a target class aligned with the teacher's own interests. Experimental results show that our proposed algorithm effectively and efficiently manipulates the model's predictions, surpassing conventional baselines.
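The flavor of such a combinatorial search can be illustrated in a much simplified form (the paper's actual algorithm and student model are not specified here; the nearest-centroid student and greedy label-flipping below are our own illustrative choices): greedily apply one discrete edit per round, picking the edit that most improves the teacher's objective.

```python
import numpy as np

def predict(X, y, x):
    """Nearest-centroid 'student' over binary labels 0/1."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    return int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

def greedy_teach(X, y, x_target, target, budget):
    """Greedily flip one training label per round (a discrete edit on the
    training data) until the student predicts `target` for x_target."""
    y = y.copy()
    flips = []
    for _ in range(budget):
        if predict(X, y, x_target) == target:
            break
        best_i, best_margin = None, np.inf
        for i in range(len(y)):
            y2 = y.copy(); y2[i] ^= 1
            if not (y2 == 0).any() or not (y2 == 1).any():
                continue                      # never empty a class
            c_t = X[y2 == target].mean(axis=0)
            c_o = X[y2 != target].mean(axis=0)
            # smaller margin = target centroid closer than the other one
            margin = np.linalg.norm(x_target - c_t) - np.linalg.norm(x_target - c_o)
            if margin < best_margin:
                best_i, best_margin = i, margin
        y[best_i] ^= 1
        flips.append(best_i)
    return y, flips

# Two clusters; steer the prediction for a point near cluster 0 to class 1.
X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [5.0, 5.0], [5.5, 5.5]])
y = np.array([0, 0, 0, 1, 1])
y_new, flips = greedy_teach(X, y, np.array([1.5, 1.5]), target=1, budget=3)
print(predict(X, y_new, np.array([1.5, 1.5])), flips)
```

A handful of label flips near the target point suffices to move the decision, which is the efficiency property the abstract emphasizes.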
Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
Simulated annealing is a popular method for approaching the solution of a global optimization problem. Existing results on its performance apply to discrete combinatorial optimization where the optimization variables can assume only a finite set of possible values. We introduce a new general formulation of simulated annealing which allows one to guarantee finite-time performance in the optimization of functions of continuous variables. The results hold universally for any optimization problem on a bounded domain and establish a connection between simulated annealing and up-to-date theory of convergence of Markov chain Monte Carlo methods on continuous domains. This work is inspired by the concept of finite-time learning with known accuracy and confidence developed in statistical learning theory.
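For intuition, here is a bare-bones simulated-annealing loop on a bounded one-dimensional domain. This toy sketch (Gaussian proposals clipped to the domain, geometric cooling, Metropolis acceptance) does not implement the paper's general formulation and carries none of its finite-time guarantees:

```python
import math
import random

def simulated_annealing(f, lo, hi, iters=20000, t0=2.0, seed=0):
    """Minimize f on the interval [lo, hi] by annealed Metropolis sampling."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)
    fx = f(x)
    best_x, best_f = x, fx
    for k in range(iters):
        T = t0 * 0.999 ** k                           # geometric cooling
        y = min(hi, max(lo, x + rng.gauss(0.0, 1.0)))  # clipped proposal
        fy = f(y)
        if fy < best_f:                               # track best point seen
            best_x, best_f = y, fy
        if fy <= fx or rng.random() < math.exp((fx - fy) / T):
            x, fx = y, fy                             # Metropolis acceptance
    return best_x, best_f

# Multimodal objective on [-10, 10] with global minimum f(0) = 0.
f = lambda x: 0.1 * x * x - math.cos(3.0 * x) + 1.0
x_star, f_star = simulated_annealing(f, -10.0, 10.0)
print(x_star, f_star)
```

Early high temperatures let the chain cross the barriers between the local minima; the cooling phase then concentrates it in the deepest basin visited, which is the MCMC connection the abstract refers to.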
Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization
Probabilistic graphical models are powerful tools for analyzing constrained, continuous domains. However, finding most-probable explanations (MPEs) in these models can be computationally expensive. In this paper, we improve the scalability of MPE inference in a class of graphical models with piecewise-linear and piecewise-quadratic dependencies and linear constraints over continuous domains. We derive algorithms based on a consensus-optimization framework and demonstrate their superior performance over the state of the art. We show empirically that, in a large-scale voter-preference modeling problem, our algorithms scale linearly in the number of dependencies and constraints.
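Consensus optimization splits a sum of objective terms into independent local subproblems whose copies are forced to agree via averaging and dual updates. A toy consensus-ADMM sketch (a generic stand-in chosen by us for illustration, not the authors' MPE solver), minimizing the simple quadratic sum Σᵢ (x − aᵢ)²:

```python
import numpy as np

def consensus_admm(a, rho=1.0, iters=100):
    """Minimize sum_i (x - a_i)^2 by giving each term its own copy x_i,
    constrained to agree with a global consensus variable z."""
    n = len(a)
    x = np.zeros(n)   # local copies, one per objective term
    u = np.zeros(n)   # scaled dual variables
    z = 0.0           # global consensus variable
    for _ in range(iters):
        # local prox steps: argmin_x (x - a_i)^2 + (rho/2)(x - z + u_i)^2
        x = (2.0 * a + rho * (z - u)) / (2.0 + rho)
        z = np.mean(x + u)   # consensus step: simple averaging
        u = u + x - z        # dual update
    return z

a = np.array([1.0, 2.0, 6.0])
print(consensus_admm(a))  # converges to mean(a) = 3.0
```

Each local prox step touches only one dependency, which is what allows this style of solver to scale linearly in the number of dependencies and constraints.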